Refactor producer send paths to shrink future sizes #348

ChrisWrenDev · 2025-08-20T16:01:07Z

Issue: #347

This PR refactors how producers send messages to reduce the size of generated async futures and to fix unnecessary lock contention during batching.

The public API and behavior are unchanged — these are internal improvements.

Motivation
Oversized async futures (~16 KB):

Previously, Producer::send_non_blocking and related methods held on to the network send future returned by connection.sender().send(..) across an .await.
That network future captures a large amount of state internally, which forced the compiler to generate async state machines that were ~16 KB in size.
This bloated types like SendFuture and Result<SendFuture, Error> and propagated large futures into downstream code.

Unnecessary batch lock contention:

In batching mode, the code serialized every message while still holding the batch mutex.
Serialization can involve allocations and compression, so other tasks trying to enqueue messages were blocked on the lock for longer than necessary.

Changes
Introduced send_inner_retry / start_send_once helpers:

The network send future is created synchronously and immediately wrapped into a much smaller future.
No large future is ever held across an .await.
The retry loop logic for reconnects and IO errors is preserved exactly as before.

Refactored batching (send_batch and batching branch of send_raw):

Messages are drained from the queue under the lock, then the lock is released.
Only after releasing the lock are the messages serialized and compressed.
This ensures lock contention is minimal, while maintaining the same batching and receipt fan-out behavior.

Benefits

Futures are small and cheap: SendFuture itself remains just 8 bytes, and callers of send_non_blocking no longer cause giant state machines to be generated.
Better concurrency: The batch mutex is only held for lightweight queue operations. Other producers can continue pushing messages while serialization happens.
Behavior preserved: Compression, retries, batching, and public APIs are unchanged.

Testing

Verified with -Zprint-type-sizes that the generated futures are now small (hundreds of bytes instead of ~16 KB).
Verified against Risingwave locally that clippy::large_future errors are resolved.

…e serialization Signed-off-by: chris wren <[email protected]>

Signed-off-by: chris wren <[email protected]>

ChrisWrenDev · 2025-08-20T16:07:13Z

All of the changes in send_non_blocking could alternatively be simplified by boxing the network future. That would avoid the refactor but introduce a heap allocation on every send.

I kept the current approach to eliminate the allocation and keep futures lightweight, but I’m open to reverting to the boxed-future solution if reviewers prefer the safer path. Given that Producer currently lacks direct tests, it’s harder to guarantee no behavioral regressions from the refactor.

ChrisWrenDev · 2025-08-20T16:12:49Z

@freeznet tagging you for review. Thanks!

BewareMyPower · 2025-08-27T02:42:43Z

but introduce a heap allocation on every send.

IMO, heap allocation is acceptable.

BTW, could you use another PR to improve the batch lock contention? It seems to be a separated issue with the large future size.

BewareMyPower · 2025-08-27T03:22:57Z

src/producer.rs

+        let f = async move {
+            let res = fut.await;
+            res.map_err(|e| {
+                error!("wait send receipt got error: {:?}", e);
+                Error::Producer(ProducerError::Connection(e))
+            })
+        };


Is this necessary to reduce the future size? After putting ensure_connected().await before this method, the connection send future won't be held across an .await.

BTW, could you share more details about the outputs of build with the -Zprint-type-sizes flag? i.e. which future's type size is large, as well as the size after the change? It would be helpful to verify if any regression could be introduced in future.

By doing ensure_connected().await before creating the send future, then creating the send future in a sync function start_send_once and returning a thin adapter (e.g., Ok(async move { fut.await.map_err(..) })), the outer async fn no longer holds the big future and now only returns a small wrapper.

I will rerun -Zprint-type-sizes on the current branch and on main to regenerate a before/after comparison to illustrate the changes.

Signed-off-by: chris wren <[email protected]>

… master Signed-off-by: chris wren <[email protected]>

ChrisWrenDev · 2025-08-28T17:00:24Z

After investigating with -Zprint-type-sizes, I couldn’t observe meaningful type size differences directly in the pulsar-rs crate, but in RisingWave I was able to see sizes. Interestingly, the reported sizes were under Clippy’s large_futures threshold of 16 384 bytes, and my earlier changes actually made the future slightly larger—even though they silenced the Clippy warnings.

This seems to be due to the way Clippy estimates future sizes, which can yield false positives. (See discussion in risingwavelabs/risingwave#22971)

After stripping back my earlier refactor, I found that the only change needed to eliminate the downstream Clippy large_futures warnings in RisingWave was making send_compress synchronous. Since compression is CPU-bound and doesn’t benefit from async, this removes an unnecessary state machine in the call chain. While not required for actual size reduction, it silences Clippy downstream and makes the code simpler.

Before:

type: `{async fn body of sink::pulsar::PulsarPayloadWriter<'_>::send_message()}`: 16184 bytes, alignment: 8 bytes
print-type-size         local `.__awaitee`: 15656 bytes, alignment: 8 bytes, 
type: {async fn body of pulsar::Producer<pulsar::TokioExecutor>::send_non_blocking<pulsar::producer::Message>()}

After:

type: `{async fn body of sink::pulsar::PulsarPayloadWriter<'_>::send_message()}`: 16728 bytes, alignment: 8 bytes
local `.__awaitee`: 16200 bytes, alignment: 8 bytes, type: {async fn body of pulsar::Producer<pulsar::TokioExecutor>::send_non_blocking<pulsar::producer::Message>()}

Compression Only:

type: `{async fn body of sink::pulsar::PulsarPayloadWriter<'_>::send_message()}`: 16056 bytes, alignment: 8 bytes
local `.__awaitee`: 15528 bytes, alignment: 8 bytes, type: {async fn body of pulsar::Producer<pulsar::TokioExecutor>::send_non_blocking<pulsar::producer::Message>()}

BewareMyPower

I opened a PR to support batch timeout: #354, this PR abstracts a synchronous compress_message function as well.

It involves much code refactoring so that it will have conflicts with this PR, I'd like to hold this PR for a while. Feel free to review my PR as well.

ChrisWrenDev added 2 commits August 20, 2025 15:26

refactor(producer): shrink async futures and release batch lock befor…

b649482

…e serialization Signed-off-by: chris wren <[email protected]>

test(producer): unit tests for batching and message conversion

e6044e6

Signed-off-by: chris wren <[email protected]>

ChrisWrenDev mentioned this pull request Aug 20, 2025

Large future issues in Pulsar and Kafka crates risingwavelabs/risingwave#22914

Open

4 tasks

BewareMyPower assigned ChrisWrenDev Aug 26, 2025

BewareMyPower reviewed Aug 27, 2025

View reviewed changes

This was referenced Aug 27, 2025

test(producer): add unit tests for producer.rs #350

Merged

refactor(producer): remove unnecessary lock contention #351

Merged

ChrisWrenDev force-pushed the fix/producer-small-futures branch from f413611 to ebb57e6 Compare August 27, 2025 15:34

chore(producer): revert unit tests and lock contention changes

14f8ac2

Signed-off-by: chris wren <[email protected]>

ChrisWrenDev force-pushed the fix/producer-small-futures branch from ebb57e6 to 14f8ac2 Compare August 27, 2025 15:36

ChrisWrenDev added 2 commits August 28, 2025 16:42

chore(producer): reverted ensure_connected and start_send_once helpers

1c2e218

Signed-off-by: chris wren <[email protected]>

chore: resolve merge conflicts between fix/producer-small-futures and…

3268d17

… master Signed-off-by: chris wren <[email protected]>

ChrisWrenDev force-pushed the fix/producer-small-futures branch from c149c34 to 3268d17 Compare August 28, 2025 15:46

ChrisWrenDev mentioned this pull request Aug 29, 2025

test: add recv_within helper and timeouts to prevent hangs in consumer tests #352

Merged

Merge branch 'master' into fix/producer-small-futures

b47b489

BewareMyPower reviewed Sep 3, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Refactor producer send paths to shrink future sizes #348

Refactor producer send paths to shrink future sizes #348

Uh oh!

ChrisWrenDev commented Aug 20, 2025

Uh oh!

ChrisWrenDev commented Aug 20, 2025

Uh oh!

ChrisWrenDev commented Aug 20, 2025

Uh oh!

BewareMyPower commented Aug 27, 2025

Uh oh!

BewareMyPower Aug 27, 2025

Uh oh!

ChrisWrenDev Aug 27, 2025

Uh oh!

ChrisWrenDev commented Aug 28, 2025

Uh oh!

BewareMyPower left a comment

Uh oh!

Uh oh!

Refactor producer send paths to shrink future sizes #348

Are you sure you want to change the base?

Refactor producer send paths to shrink future sizes #348

Uh oh!

Conversation

ChrisWrenDev commented Aug 20, 2025

Uh oh!

ChrisWrenDev commented Aug 20, 2025

Uh oh!

ChrisWrenDev commented Aug 20, 2025

Uh oh!

BewareMyPower commented Aug 27, 2025

Uh oh!

BewareMyPower Aug 27, 2025

Choose a reason for hiding this comment

Uh oh!

ChrisWrenDev Aug 27, 2025

Choose a reason for hiding this comment

Uh oh!

ChrisWrenDev commented Aug 28, 2025

Uh oh!

BewareMyPower left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!